University of San Francisco, MSMI-603
2022-12-31
So far, we have learned about:
(among other things)
We’ve learned to say things like:
clicking
between group A and B is 2%
income
, customers spend $25 more in our stores
Have we learned to say things like:
clicking
between group A and B is 2%
income
, customers spend $25 more in our stores
No!
For the clearest example, let’s focus on the third:
Effect sizes put our results into a standard format.
There are two kinds of effect sizes, broadly:
We will learn two today
\(\frac{(M_A - M_B)}{SD_{AB}}\)
This tells us how large the difference between the group means is relative to the spread of the data: a value of 1 means the groups differ by one standard deviation.
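A minimal sketch of the formula above in R, assuming \(SD_{AB}\) is the pooled standard deviation of the two groups (the usual choice for what is commonly called Cohen's d; the slide does not spell this out). The function name `cohens_d` and the toy vectors are illustrative, not course data.

```r
# Standardized mean difference between two groups (Cohen's d),
# assuming SD_AB is the pooled standard deviation.
cohens_d <- function(a, b) {
  n_a <- length(a)
  n_b <- length(b)
  # Pooled SD: weight each group's variance by its degrees of freedom
  sd_ab <- sqrt(((n_a - 1) * var(a) + (n_b - 1) * var(b)) / (n_a + n_b - 2))
  (mean(a) - mean(b)) / sd_ab
}

# Toy example: means 4 and 3, both groups have variance 4 (pooled SD = 2)
group_a <- c(2, 4, 6)
group_b <- c(1, 3, 5)
d <- cohens_d(group_a, group_b)
print(d)  # 0.5: the groups differ by half a standard deviation
```

Because the result is expressed in standard deviations rather than raw units, it can be compared across studies that measured different things.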
Heights of men and women in the US:
Are men and women different heights on average?
Are people more aggressive toward individuals who have provoked them?
Are people who are seen as more credible also more persuasive?
Sometimes you can look to other research
Sometimes you cannot
Hypothesis: Mode of ordering (smartphone vs. desktop) will influence people’s portion choices
\(\text{Portion Size} = \beta_{Device}\,Device + \beta_{Hunger}\,Hunger + \beta_{Dieting}\,Dieting\)
…And use common sense…
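Adjusted \(R^2\): \(R^2_{adj} = 1 - \frac{SSR / (n - p - 1)}{SST / (n - 1)}\)
Here \(SSR\) is the residual sum of squares, \(SST\) is the total sum of squares, \(n\) is the number of observations, and \(p\) is the number of predictors.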
This tells you:
lm()
anova()
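The adjusted R-squared formula above can be checked by hand against what lm() reports. This is a sketch only: it uses R's built-in `mtcars` data and an arbitrary two-predictor model as stand-ins, not the course's customer data.

```r
# Fit an illustrative model on built-in data (not the course data set)
fit <- lm(mpg ~ wt + hp, data = mtcars)

n <- nrow(mtcars)  # number of observations
p <- 2             # number of predictors (wt and hp)

ssr <- sum(residuals(fit)^2)                    # residual sum of squares
sst <- sum((mtcars$mpg - mean(mtcars$mpg))^2)   # total sum of squares

# Adjusted R-squared: each sum of squares divided by its degrees of freedom
adj_r2 <- 1 - (ssr / (n - p - 1)) / (sst / (n - 1))

# Compare with the value reported by summary()
all.equal(adj_r2, summary(fit)$adj.r.squared)
```

The penalty comes from dividing SSR by \(n - p - 1\): adding a predictor that explains nothing shrinks the denominator and lowers the adjusted value.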
From anova()
customerData <- read.csv('customerData.csv')
m_1 <- lm(sat.service ~ 1, data = customerData)              # Just the mean
m_2 <- lm(sat.service ~ email, data = customerData)          # Effect of email
m_3 <- lm(sat.service ~ email + income, data = customerData) # Effect of email and income
anova(m_1, m_2, m_3)  # Compare the nested models in sequence
Analysis of Variance Table
Model 1: sat.service ~ 1
Model 2: sat.service ~ email
Model 3: sat.service ~ email + income
  Res.Df     RSS Df Sum of Sq        F  Pr(>F)
1    590 1187.70
2    589 1179.40  1      8.30   5.9544 0.01497 *
3    588  819.51  1    359.89 258.2261 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
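One way to read an effect size off this table is to express each drop in RSS as a share of the total. The sketch below copies the RSS values from the table and treats the intercept-only model's RSS as the total sum of squares; the variable names are illustrative.

```r
# Residual sums of squares copied from the anova() table
rss <- c(m_1 = 1187.70, m_2 = 1179.40, m_3 = 819.51)

# The intercept-only model leaves all variance unexplained,
# so its RSS is the total sum of squares
sst <- rss[["m_1"]]

# Share of total variance explained by each added predictor
r2_change <- -diff(rss) / sst
print(round(r2_change, 3))  # email: about 0.007, income: about 0.303

# Cumulative R-squared of each model
r2_total <- 1 - rss / sst
print(round(r2_total, 3))
```

Both predictors are statistically significant, but email accounts for under 1% of the variance in service satisfaction while income accounts for roughly 30%.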
From lm()
Call:
lm(formula = sat.service ~ email, data = customerData)
Residuals:
    Min      1Q  Median      3Q     Max
-3.9813 -0.7347  0.0187  1.0187  4.0187
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.98131    0.09673  41.159   <2e-16 ***
emailyes    -0.24656    0.12111  -2.036   0.0422 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.415 on 589 degrees of freedom
(409 observations deleted due to missingness)
Multiple R-squared: 0.006987, Adjusted R-squared: 0.005301
F-statistic: 4.144 on 1 and 589 DF, p-value: 0.04222
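The same output can be turned into the standardized difference from earlier. This is a sketch under two assumptions: that email is a two-level factor, so the emailyes coefficient is the difference between the group means, and that the residual standard error can stand in for the pooled standard deviation, which holds for a model with a single two-group predictor.

```r
# Values copied from the lm() summary
b_email  <- -0.24656  # emailyes coefficient: difference in group means
resid_se <- 1.415     # residual standard error: pooled SD in this model

# Standardized difference between the email and no-email groups
d_email <- b_email / resid_se
print(round(d_email, 2))  # about -0.17
```

A difference of about 0.17 standard deviations sits below the 0.2 that Cohen's rule of thumb calls small, which agrees with the R-squared of under 1% reported above.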